A Fast Analytical Algorithm for MDPs with Continuous State Spaces
Authors
Abstract
Many real-world domains require agents to plan their future actions despite uncertainty, and such plans must handle continuous states, i.e., states with continuous values. While finite-horizon continuous state MDPs enable agents to address such domains, finding an optimal policy is computationally expensive. Although previous work provided approximation techniques to reduce the computational burden (particularly in the convolution process for finding optimal policies), both the computational cost and the incurred error remain high. In contrast, we propose a new method, CPH, that solves continuous state MDPs for both finite and infinite horizons. CPH provides a fast analytical solution to the convolution process under the assumption that continuous state values change according to phase-type distributions. This assumption still allows our method to approximate arbitrary probability density functions of continuous state transitions. Our experiments show that CPH achieves significant speedups over the Lazy Approximation algorithm, the leading algorithm for solving continuous state MDPs.
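The practical appeal of phase-type distributions is that they chain exponential phases, so they can approximate arbitrary positive densities while keeping expectations and convolutions analytically tractable. Below is a minimal sketch of that approximation step using two-moment Erlang fitting (the Erlang family is the simplest phase-type distribution); this is our illustration, not the authors' CPH implementation, and `fit_erlang` is a hypothetical helper.

```python
# Two-moment Erlang fit: a minimal sketch of approximating an arbitrary
# positive transition density with a phase-type (here Erlang) distribution.
# Illustration only -- not the authors' CPH code.
import numpy as np
from scipy import stats

def fit_erlang(samples):
    """Match mean and variance of Erlang(n, lam): mean = n/lam, var = n/lam^2."""
    mean, var = np.mean(samples), np.var(samples)
    n = max(1, int(round(mean ** 2 / var)))   # number of exponential phases
    lam = n / mean                            # rate of each phase
    return n, lam

# Example: approximate a lognormal "state change" density by an Erlang.
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=0.5, sigma=0.4, size=10_000)
n, lam = fit_erlang(samples)
print(f"Erlang approximation: {n} phases, rate {lam:.3f}")

# The fitted density (an Erlang is a Gamma with integer shape):
print(stats.gamma.pdf([0.5, 1.0, 2.0, 4.0], a=n, scale=1.0 / lam))
```

Because every phase is exponential, integrals against the fitted density have closed forms, which is what makes an analytical rather than numerical treatment of the convolution step possible.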
Similar resources
Continuous-action reinforcement learning with fast policy search and adaptive basis function selection
As an important approach to solving complex sequential decision problems, reinforcement learning (RL) has been widely studied in the community of artificial intelligence and machine learning. However, the generalization ability of RL is still an open problem and it is difficult for existing RL algorithms to solve Markov decision problems (MDPs) with both continuous state and action spaces. In t...
Linearly-solvable Markov decision problems
Advances in Neural Information Processing Systems 2006. We introduce a class of MDPs which greatly simplify reinforcement learning. They have discrete state spaces and continuous control spaces. The controls have the effect of rescaling the transition probabilities of an underlying Markov chain. A control cost penalizing KL divergence between controlled and uncontrolled transition probabilities ...
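The simplification comes from a change of variables: with the desirability function z(i) = exp(-v(i)), the Bellman backup becomes linear in z, i.e., a matrix-vector product with the uncontrolled chain. A small finite-horizon sketch of this idea (our illustration based on the abstract, not code from the paper):

```python
# Finite-horizon backup for a linearly-solvable MDP: z_t = exp(-q) * (P @ z_{t+1}),
# where P is the uncontrolled transition matrix and q the per-state cost.
import numpy as np

def lmdp_finite_horizon(P, q, q_final, T):
    z = np.exp(-q_final)               # z_T = exp(-terminal cost)
    for _ in range(T):
        z = np.exp(-q) * (P @ z)       # linear in z; no max over actions needed
    return -np.log(z)                  # recover the value function v = -log z

# Tiny 3-state example with a random row-stochastic P.
rng = np.random.default_rng(1)
P = rng.random((3, 3)); P /= P.sum(axis=1, keepdims=True)
q = np.array([0.1, 0.5, 1.0])
print(lmdp_finite_horizon(P, q, q_final=np.zeros(3), T=10))
# The optimal controlled transitions rescale the chain: u*(j|i) is
# proportional to P[i, j] * z[j].
```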
Online Linear Regression and Its Application to Model-Based Reinforcement Learning
We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and show that a special type of online linear regression allows us to learn MDPs with (possibly kernelized) linearly parameterized dynamics. This result builds on Kearns and Singh’s work that provides ...
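A generic way to realize online linear regression for model learning is recursive least squares, which maintains the inverse Gram matrix via rank-one (Sherman-Morrison) updates so each transition is absorbed in O(d^2) time. The sketch below learns linearly parameterized dynamics x' ≈ Wᵀφ(x, u); it illustrates the flavor of the approach only, not the paper's specific algorithm or its analysis.

```python
# Plain recursive least squares (online ridge regression) for learning
# linearly parameterized dynamics. Illustrative sketch, not the paper's method.
import numpy as np

class OnlineLinearModel:
    def __init__(self, feat_dim, out_dim, reg=1.0):
        self.A_inv = np.eye(feat_dim) / reg      # (reg*I + sum phi phi^T)^-1
        self.B = np.zeros((feat_dim, out_dim))   # sum phi x'^T

    def update(self, phi, x_next):
        Ap = self.A_inv @ phi
        self.A_inv -= np.outer(Ap, Ap) / (1.0 + phi @ Ap)  # Sherman-Morrison
        self.B += np.outer(phi, x_next)

    def predict(self, phi):
        return (self.A_inv @ self.B).T @ phi     # W^T phi with W = A_inv @ B

# Usage: learn x' = A x + B u from streaming transitions, phi(x, u) = [x; u].
rng = np.random.default_rng(2)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [0.5]])
model = OnlineLinearModel(feat_dim=3, out_dim=2)
x = np.zeros(2)
for _ in range(500):
    u = rng.normal(size=1)
    x_next = A_true @ x + B_true @ u + 0.01 * rng.normal(size=2)
    model.update(np.concatenate([x, u]), x_next)
    x = x_next
print(model.predict(np.array([1.0, -0.5, 0.2])))  # predicted next state
```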
A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability distributions, such as execution time or battery power. These planning problems can be modeled with continuous state Markov decision processes (MDPs) but existing solution methods are either inefficient or provide no gu...
Linear Program Approximations for Factored Continuous-State Markov Decision Processes
Approximate linear programming (ALP) has emerged recently as one of the most promising methods for solving complex factored MDPs with finite state spaces. In this work we show that ALP solutions are not limited to MDPs with finite state spaces, but can also be applied successfully to factored continuous-state MDPs (CMDPs). We show how one can build an ALP-based approximation for ...
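Concretely, ALP restricts the value function to a span of basis functions, so each Bellman inequality becomes a linear constraint on the basis weights; for a continuous state space the constraints can be enforced at sampled states. A hedged sketch of that recipe (the basis, reward, dynamics, and sampling below are illustrative assumptions, not the paper's construction):

```python
# ALP sketch for a 1-D continuous-state MDP: impose sampled Bellman
# inequalities as linear constraints and solve the LP over basis weights.
import numpy as np
from scipy.optimize import linprog

gamma = 0.95
rng = np.random.default_rng(3)

def phi(s):                       # illustrative polynomial basis on [0, 1]
    return np.array([1.0, s, s * s])

def reward(s, a):                 # illustrative reward: stay near 0.5
    return -(s - 0.5) ** 2 - 0.01 * a * a

def next_states(s, a, n=20):      # illustrative noisy dynamics on [0, 1]
    return np.clip(s + 0.1 * a + 0.05 * rng.normal(size=n), 0.0, 1.0)

states = rng.random(100)
actions = [-1.0, 0.0, 1.0]

A_ub, b_ub = [], []
for s in states:
    for a in actions:
        Ephi = np.mean([phi(sp) for sp in next_states(s, a)], axis=0)
        # Bellman inequality  phi(s)^T w >= r(s, a) + gamma * E[phi(s')]^T w,
        # rewritten for linprog as  -(phi(s) - gamma * E[phi(s')])^T w <= -r(s, a).
        A_ub.append(-(phi(s) - gamma * Ephi))
        b_ub.append(-reward(s, a))

# Objective: minimize the average approximate value over the sampled states.
c = np.mean([phi(s) for s in states], axis=0)
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * 3)
print("basis weights:", res.x)
```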
Journal:
Volume / Issue:
Pages: -
Publication date: 2006